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Status of this Memo 


This document specifies an Internet standards track protocol for the 
Internet community, and requests discussion and suggestions for 


improvements. Please refer to the current edition of the "Internet 

Official Protocol Standards" (STD 1) for the standardization state 

and status of this protocol. Distribution of this memo is unlimited. 
Abstract 


The COMPRESS extension allows an IMAP connection to be effectively 
and efficiently compressed. 
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Iz 


Introduction and Overview 


A server which supports the COMPRESS extension indicates this with 
one or more capability names consisting of "COMPRESS=" followed by a 
supported compression algorithm name as described in this document. 


The goal of COMPRESS is to reduce the bandwidth usage of IMAP. 


Compared to PPP compression (see [RFC1962]) and modem-based 
compression (see [MNP] and [V42BIS]), COMPRESS offers much better 
compression efficiency. COMPRESS can be used together with Transport 
Security Layer (TLS) [RFC4346], Simple Authentication and Security 
layer (SASL) encryption, Virtual Private Networks (VPNs), etc. 
Compared to TLS compression [RFC3749], COMPRESS has the following 
(dis)advantages: 


- COMPRESS can be implemented easily both by IMAP servers and 
clients. 


- IMAP COMPRESS benefits from an intimate knowledge of the IMAP 
protocol’s state machine, allowing for dynamic and aggressive 
optimization of the underlying compression algorithm’s parameters. 


- When the TLS layer implements compression, any protocol using that 
layer can transparently benefit from that compression (e.g., SMTP 
and IMAP). COMPRESS is specific to IMAP. 


In order to increase interoperation, it is desirable to have as few 
different compression algorithms as possible, so this document 
specifies only one. The DEFLATE algorithm (defined in [RFC1951]) is 
standard, widely available and fairly efficient, so it is the only 
algorithm defined by this document. 


In order to increase interoperation, IMAP servers that advertise this 
extension SHOULD also advertise the TLS DEFLATE compression mechanism 
as defined in [RFC3749]. IMAP clients MAY use either COMPRESS or TLS 
compression, however, if the client and server support both, it is 
RECOMMENDED that the client choose TLS compression. 


The extension adds one new command (COMPRESS) and no new responses. 
Conventions Used in This Document 

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", “SHALL NOT", 
"SHOULD", “SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 


document are to be interpreted as described in [RFC2119]. 


Formal syntax is defined by [RFC4234] as modified by [RFC3501]. 
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In the examples, "C:" and "S:" indicate lines sent by the client and 
server respectively. "[...]" denotes elision. 


3. The COMPRESS Command 
Arguments: Name of compression mechanism: "DEFLATE". 
Responses: None 


Result: OK The server will compress its responses and expects the 
client to compress its commands. 
NO Compression is already active via another layer. 
BAD Command unknown, invalid or unknown argument, or COMPRESS 
already active. 


The COMPRESS command instructs the server to use the named 
compression mechanism ("DEFLATE" is the only one defined) for all 
commands and/or responses after COMPRESS. 


The client MUST NOT send any further commands until it has seen the 
result of COMPRESS. If the response was OK, the client MUST compress 
starting with the first command after COMPRESS. If the server 
response was BAD or NO, the client MUST NOT turn on compression. 


If the server responds NO because it knows that the same mechanism is 
active already (e.g., because TLS has negotiated the same mechanism), 
it MUST send COMPRESSIONACTIVE as resp-text-code (see [RFC3501], 
Section 7.1), and the resp-text SHOULD say which layer compresses. 


If the server issues an OK response, the server MUST compress 
starting immediately after the CRLF which ends the tagged OK 
response. (Responses issued by the server before the OK response 
will, of course, still be uncompressed.) If the server issues a BAD 
or NO response, the server MUST NOT turn on compression. 


For DEFLATE (as for many other compression mechanisms), the 
compressor can trade speed against quality. When decompressing there 
isn’t much of a tradeoff. Consequently, the client and server are 
both free to pick the best reasonable rate of compression for the 
data they send. 


When COMPRESS is combined with TLS (see [RFC4346]) or SASL (see 
[RFC4422]) security layers, the sending order of the three extensions 
MUST be first COMPRESS, then SASL, and finally TLS. That is, before 
data is transmitted it is first compressed. Second, if a SASL 
security layer has been negotiated, the compressed data is then 
signed and/or encrypted accordingly. Third, if a TLS security layer 
has been negotiated, the data from the previous step is signed and/or 
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encrypted accordingly. When receiving data, the processing order 
MUST be reversed. This ensures that before sending, data is 
compressed before it is encrypted, independent of the order in which 
the client issues COMPRESS, AUTHENTICATE, and STARTTLS. 


The following example illustrates how commands and responses are 
compressed during a simple login sequence: 


* OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE] 
a starttls 
a OK TLS active 


nan 


From this point on, everything is encrypted. 


b login arnt tnra 

b OK Logged in as arnt 
c compress deflate 

d OK DEFLATE active 


NAVNA 


From this point on, everything is compressed before being 
encrypted. 


The following example demonstrates how a server may refuse to 
compress twice: 


S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE] 
eres | 

C: c compress deflate 

S: c NO [COMPRESSIONACTIVE] DEFLATE active via TLS 


4. Compression Efficiency 
This section is informative, not normative. 
IMAP poses some unusual problems for a compression layer. 
Upstream is fairly simple. Most IMAP clients send the same few 
commands again and again, so any compression algorithm that can 
exploit repetition works efficiently. The APPEND command is an 
exception; clients that send many APPEND commands may want to 
surround large literals with flushes in the same way as is 


recommended for servers later in this section. 


Downstream has the unusual property that several kinds of data are 
sent, confusing all dictionary-based compression algorithms. 
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One type is IMAP responses. These are highly compressible; zlib 
using its least CPU-intensive setting compresses typical responses to 
25-40% of their original size. 


Another type is email headers. These are equally compressible, and 
benefit from using the same dictionary as the IMAP responses. 


A third type is email body text. Text is usually fairly short and 
includes much ASCII, so the same compression dictionary will doa 
good job here, too. When multiple messages in the same thread are 
read at the same time, quoted lines etc. can often be compressed 
almost to zero. 


Finally, attachments (non-text email bodies) are transmitted, either 
in binary form or encoded with base-64. 


When attachments are retrieved in binary form, DEFLATE may be able to 
compress them, but the format of the attachment is usually not IMAP- 
like, so the dictionary built while compressing IMAP does not help. 
The compressor has to adapt its dictionary from IMAP to the 
attachment’s format, and then back. A few file formats aren’t 
compressible at all using deflate, e.g., .gz, .zip, and .jpg files. 


When attachments are retrieved in base-64 form, the same problems 
apply, but the base-64 encoding adds another problem. 8-bit 
compression algorithms such as deflate work well on 8-bit file 
formats, however base-64 turns a file into something resembling 6-bit 
bytes, hiding most of the 8-bit file format from the compressor. 


When using the zlib library (see [RFC1951]), the functions 
deflateInit2(), deflate(), inflateInit2(), and inflate() suffice to 
implement this extension. The windowBits value must be in the range 
-8 to -15, or else deflateInit2() uses the wrong format. 
deflateParams() can be used to improve compression rate and resource 
use. The Z_FULL_FLUSH argument to deflate() can be used to clear the 
dictionary (the receiving peer does not need to do anything). 


A client can improve downstream compression by implementing BINARY 
(defined in [RFC3516]) and using FETCH BINARY instead of FETCH BODY. 
In the author’s experience, the improvement ranges from 5% to 40% 
depending on the attachment being downloaded. 


A server can improve downstream compression if it hints to the 
compressor that the data type is about to change strongly, e.g., by 
sending a Z_FULL_FLUSH at the start and end of large non-text 
literals (before and after ’*CHAR8’ in the definition of literal in 
RFC 3501, page 86). Small literals are best left alone. A possible 
boundary is 5k. 
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A server can improve the CPU efficiency both of the server and the 
client if it adjusts the compression level (e.g., using the 
deflateParams() function in zlib) at these points, to avoid trying to 
compress incompressible attachments. A very simple strategy is to 
change the level to 0 at the start of a literal provided the first 
two bytes are either 0x1F 0x8B (as in deflate-compressed files) or 
OxFF OxD8 (JPEG), and to keep it at 1-5 the rest of the time. More 
complex strategies are possible. 


5. Formal Syntax 


The following syntax specification uses the Augmented Backus-Naur 
Form (ABNF) notation as specified in [RFC4234]. This syntax augments 
the grammar specified in [RFC3501]. [RFC4234] defines SP and 
[RFC3501] defines command-auth, capability, and resp-text-—code. 


Except as noted otherwise, all alphabetic characters are case- 
insensitive. The use of upper or lower case characters to define 
token strings is for editorial clarity only. Implementations MUST 
accept these strings in a case-insensitive fashion. 


command-auth =/ compress 


compress "COMPRESS" SP algorithm 


capability =/ "COMPRESS=" algorithm 
7; multiple COMPRESS capabilities allowed 


"DEFLATE" 


algorithm 
resp-text-code =/ "COMPRESSIONACTIVE" 


Note that due the syntax of capability names, future algorithm names 
must be atoms. 


6. Security Considerations 
As for TLS compression [RFC3749]. 
7. IANA Considerations 


The IANA has added COMPRESS=DEFLATE to the list of IMAP capabilities. 
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